[clang] Limit lifetimes of temporaries to the full expression #170517
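@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang
Author: Paul Kirth (ilovepi)
Changes
We have several issues describing suboptimal stack usage related to the lifetimes of temporary objects, such as #68747, #43598, and #109204. Previously, https://reviews.llvm.org/D74094 tried to address this. In that review, a few issues were brought up, particularly a concern that the lifetimes of the temporaries need to extend to the end of the full expression.
Fixes #68747
Co-authored-by: Nick Desaulniers <[email protected]>
Full diff: https://github.com/llvm/llvm-project/pull/170517.diff
9 Files Affected: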
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 9f8d781c93021..de66b809315a3 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -86,6 +86,15 @@ Potentially Breaking Changes
options-related code has been moved out of the Driver into a separate library.
- The ``clangFrontend`` library no longer depends on ``clangDriver``, which may
break downstream projects that relied on this transitive dependency.
+- Clang is now more precise with regards to the lifetime of temporary objects
+ such as when aggregates are passed by value to a function, resulting in
+ better sharing of stack slots and reduced stack usage. This change can lead
+ to use-after-scope related issues in code that unintentionally relied on the
+ previous behavior. If recompiling with ``-fsanitize=address`` shows a
+ use-after-scope warning, then this is likely the case, and the report printed
+ should be able to help users pinpoint where the use-after-scope is occurring.
+ Users can use ``-Xclang -sloppy-temporary-lifetimes`` to retain the old
+ behavior until they are able to find and resolve issues in their code.
C/C++ Language Potentially Breaking Changes
-------------------------------------------
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 76a6463881c6f..e7f5b4c9a08a9 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -475,6 +475,10 @@ ENUM_CODEGENOPT(ZeroCallUsedRegs, ZeroCallUsedRegsKind,
/// non-deleting destructors. (No effect on Microsoft ABI.)
CODEGENOPT(CtorDtorReturnThis, 1, 0, Benign)
+/// Set via -Xclang -sloppy-temporary-lifetimes to disable emission of lifetime
+/// marker intrinsic calls.
+CODEGENOPT(NoLifetimeMarkersForTemporaries, 1, 0, Benign)
+
/// Enables emitting Import Call sections on supported targets that can be used
/// by the Windows kernel to enable import call optimization.
CODEGENOPT(ImportCallOptimization, 1, 0, Benign)
diff --git a/clang/include/clang/Options/Options.td b/clang/include/clang/Options/Options.td
index 756d6deed7130..b5ed93d5a4d39 100644
--- a/clang/include/clang/Options/Options.td
+++ b/clang/include/clang/Options/Options.td
@@ -8135,6 +8135,10 @@ def import_call_optimization : Flag<["-"], "import-call-optimization">,
def replaceable_function: Joined<["-"], "loader-replaceable-function=">,
MarshallingInfoStringVector<CodeGenOpts<"LoaderReplaceableFunctionNames">>;
+def sloppy_temporary_lifetimes : Flag<["-"], "sloppy-temporary-lifetimes">,
+ HelpText<"Don't emit lifetime markers for temporary objects">,
+ MarshallingInfoFlag<CodeGenOpts<"NoLifetimeMarkersForTemporaries">>;
+
} // let Visibility = [CC1Option]
//===----------------------------------------------------------------------===//
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index efacb3cc04c01..92fe954fddcff 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -43,6 +43,7 @@
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Type.h"
+#include "llvm/Support/TypeSize.h"
#include "llvm/Transforms/Utils/Local.h"
#include <optional>
using namespace clang;
@@ -4947,7 +4948,23 @@ void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E,
return;
}
- args.add(EmitAnyExprToTemp(E), type);
+ AggValueSlot ArgSlot = AggValueSlot::ignored();
+ // If the callee returns a reference, skip this stack saving optimization;
+ // we don't want to prematurely end the lifetime of the temporary. It may be
+ // possible to still perform this optimization if the return type is a
+ // reference to a different type than the parameter.
+ if (hasAggregateEvaluationKind(E->getType())) {
+ RawAddress ArgSlotAlloca = Address::invalid();
+ ArgSlot = CreateAggTemp(E->getType(), "agg.tmp", &ArgSlotAlloca);
+
+ // Emit a lifetime start/end for this temporary at the end of the full
+ // expression.
+ if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
+ EmitLifetimeStart(ArgSlotAlloca.getPointer()))
+ pushFullExprCleanup<CallLifetimeEnd>(NormalAndEHCleanup, ArgSlotAlloca);
+ }
+
+ args.add(EmitAnyExpr(E, ArgSlot), type);
}
QualType CodeGenFunction::getVarArgType(const Expr *Arg) {
diff --git a/clang/test/CodeGen/lifetime-call-temp.c b/clang/test/CodeGen/lifetime-call-temp.c
new file mode 100644
index 0000000000000..3bc68b5e8024a
--- /dev/null
+++ b/clang/test/CodeGen/lifetime-call-temp.c
@@ -0,0 +1,98 @@
+// RUN: %clang -cc1 -triple x86_64-apple-macos -O1 -disable-llvm-passes %s \
+// RUN: -emit-llvm -o - | FileCheck %s --implicit-check-not=llvm.lifetime
+// RUN: %clang -cc1 -xc++ -std=c++17 -triple x86_64-apple-macos -O1 \
+// RUN: -disable-llvm-passes %s -emit-llvm -o - -Wno-return-type-c-linkage | \
+// RUN: FileCheck %s --implicit-check-not=llvm.lifetime --check-prefixes=CHECK,CXX
+// RUN: %clang -cc1 -xobjective-c -triple x86_64-apple-macos -O1 \
+// RUN: -disable-llvm-passes %s -emit-llvm -o - | \
+// RUN: FileCheck %s --implicit-check-not=llvm.lifetime --check-prefixes=CHECK,OBJC
+// RUN: %clang -cc1 -triple x86_64-apple-macos -O1 -disable-llvm-passes %s \
+// RUN: -emit-llvm -o - -sloppy-temporary-lifetimes | \
+// RUN: FileCheck %s --implicit-check-not=llvm.lifetime --check-prefixes=SLOPPY
+
+typedef struct { int x[100]; } aggregate;
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void takes_aggregate(aggregate);
+aggregate gives_aggregate();
+
+// CHECK-LABEL: define void @t1
+void t1() {
+ takes_aggregate(gives_aggregate());
+
+ // CHECK: [[AGGTMP:%.*]] = alloca %struct.aggregate, align 8
+ // CHECK: call void @llvm.lifetime.start.p0(ptr [[AGGTMP]])
+ // CHECK: call void{{.*}} @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGGTMP]])
+ // CHECK: call void @takes_aggregate(ptr noundef byval(%struct.aggregate) align 8 [[AGGTMP]])
+ // CHECK: call void @llvm.lifetime.end.p0(ptr [[AGGTMP]])
+
+ // SLOPPY: [[AGGTMP:%.*]] = alloca %struct.aggregate, align 8
+ // SLOPPY-NEXT: call void (ptr, ...) @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGGTMP]])
+ // SLOPPY-NEXT: call void @takes_aggregate(ptr noundef byval(%struct.aggregate) align 8 [[AGGTMP]])
+}
+
+// CHECK: declare {{.*}}llvm.lifetime.start
+// CHECK: declare {{.*}}llvm.lifetime.end
+
+#ifdef __cplusplus
+// CXX: define void @t2
+void t2() {
+ struct S {
+ S(aggregate) {}
+ };
+ S{gives_aggregate()};
+
+ // CXX: [[AGG:%.*]] = alloca %struct.aggregate
+ // CXX: call void @llvm.lifetime.start.p0(ptr [[AGG]]
+ // CXX: call void @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGG]])
+ // CXX: call void @_ZZ2t2EN1SC1E9aggregate(ptr {{.*}}, ptr {{.*}} byval(%struct.aggregate) align 8 [[AGG]])
+ // CXX: call void @llvm.lifetime.end.p0(ptr [[AGG]]
+}
+
+struct Dtor {
+ ~Dtor();
+};
+
+void takes_dtor(Dtor);
+Dtor gives_dtor();
+
+// CXX: define void @t3
+void t3() {
+ takes_dtor(gives_dtor());
+
+ // CXX: [[AGG:%.*]] = alloca %struct.Dtor
+ // CXX: call void @llvm.lifetime.start.p0(ptr [[AGG]])
+ // CXX: call void @gives_dtor(ptr{{.*}}sret(%struct.Dtor) align 1 [[AGG]])
+ // CXX: call void @takes_dtor(ptr noundef [[AGG]])
+ // CXX: call void @_ZN4DtorD1Ev(ptr {{.*}} [[AGG]])
+ // CXX: call void @llvm.lifetime.end.p0(ptr [[AGG]])
+ // CXX: ret void
+}
+
+#endif
+
+#ifdef __OBJC__
+
+@interface X
+-m:(aggregate)x;
+@end
+
+// OBJC: define void @t4
+void t4(X *x) {
+ [x m: gives_aggregate()];
+
+ // OBJC: [[AGG:%.*]] = alloca %struct.aggregate
+ // OBJC: call void @llvm.lifetime.start.p0(ptr [[AGG]]
+ // OBJC: call void{{.*}} @gives_aggregate(ptr{{.*}}sret(%struct.aggregate) align 4 [[AGGTMP]])
+ // OBJC: call {{.*}}@objc_msgSend
+ // OBJC: call void @llvm.lifetime.end.p0(ptr [[AGG]]
+}
+
+#endif
+
+#ifdef __cplusplus
+}
+#endif
diff --git a/clang/test/CodeGen/stack-usage-lifetimes.c b/clang/test/CodeGen/stack-usage-lifetimes.c
new file mode 100644
index 0000000000000..3787a29e4ce7d
--- /dev/null
+++ b/clang/test/CodeGen/stack-usage-lifetimes.c
@@ -0,0 +1,89 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=x86-precise
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=x86-sloppy
+
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=aarch64-precise
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=aarch64-sloppy
+
+// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=riscv-precise
+// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=riscv-sloppy
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=x86-precise -xc++
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=x86-sloppy -xc++
+
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=aarch64-precise -xc++
+// RUN: %clang_cc1 -triple aarch64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=aarch64-sloppy -xc++
+
+// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog %s -verify=riscv-precise -xc++
+// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -O1 -emit-codegen-only -Rpass-analysis=prologepilog -sloppy-temporary-lifetimes %s -verify=riscv-sloppy -xc++
+
+
+typedef struct { char x[32]; } A;
+typedef struct { char *w, *x, *y, *z; } B;
+
+void useA(A);
+void useB(B);
+A genA(void);
+B genB(void);
+
+void t1(int c) {
+ // x86-precise-remark@-1 {{40 stack bytes}}
+ // x86-sloppy-remark@-2 {{72 stack bytes}}
+ // aarch64-precise-remark@-3 {{48 stack bytes}}
+ // aarch64-sloppy-remark@-4 {{80 stack bytes}}
+ // riscv-precise-remark@-5 {{48 stack bytes}}
+ // riscv-sloppy-remark@-6 {{80 stack bytes}}
+
+ if (c)
+ useA(genA());
+ else
+ useA(genA());
+}
+
+void t2(void) {
+ // x86-precise-remark@-1 {{72 stack bytes}}
+ // x86-sloppy-remark@-2 {{72 stack bytes}}
+ // aarch64-precise-remark@-3 {{80 stack bytes}}
+ // aarch64-sloppy-remark@-4 {{80 stack bytes}}
+ // riscv-precise-remark@-5 {{80 stack bytes}}
+ // riscv-sloppy-remark@-6 {{80 stack bytes}}
+
+ useA(genA());
+ useA(genA());
+}
+
+void t3(void) {
+ // x86-precise-remark@-1 {{72 stack bytes}}
+ // x86-sloppy-remark@-2 {{72 stack bytes}}
+ // aarch64-precise-remark@-3 {{80 stack bytes}}
+ // aarch64-sloppy-remark@-4 {{80 stack bytes}}
+ // riscv-precise-remark@-5 {{80 stack bytes}}
+ // riscv-sloppy-remark@-6 {{80 stack bytes}}
+
+ useB(genB());
+ useB(genB());
+}
+
+#ifdef __cplusplus
+struct C {
+ char x[24];
+ char *ptr;
+ ~C() {};
+};
+
+void useC(C);
+C genC(void);
+
+// This case works in C++, since its AST is structured slightly differently
+// than it is in C (CompundStmt/ExprWithCleanup/CallExpr vs CompundStmt/CallExpr).
+void t4() {
+ // x86-precise-remark@-1 {{40 stack bytes}}
+ // x86-sloppy-remark@-2 {{72 stack bytes}}
+ // aarch64-precise-remark@-3 {{48 stack bytes}}
+ // aarch64-sloppy-remark@-4 {{80 stack bytes}}
+ // riscv-precise-remark@-5 {{48 stack bytes}}
+ // riscv-sloppy-remark@-6 {{80 stack bytes}}
+
+ useC(genC());
+ useC(genC());
+}
+#endif
diff --git a/clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cc b/clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cc
new file mode 100644
index 0000000000000..9b598a48f6436
--- /dev/null
+++ b/clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cc
@@ -0,0 +1,19 @@
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -O3 -disable-llvm-passes -o - %s | FileCheck %s
+
+struct A {
+ float x, y, z, w;
+};
+
+void foo(A a);
+
+// CHECK-LABEL: @_Z4testv
+// CHECK: [[A:%.*]] = alloca [[STRUCT_A:%.*]], align 4, addrspace(5)
+// CHECK-NEXT: [[AGG_TMP:%.*]] = alloca [[STRUCT_A]], align 4, addrspace(5)
+// CHECK-NEXT: [[A_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[A]] to ptr
+// CHECK-NEXT: [[AGG_TMP_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[AGG_TMP]] to ptr
+// CHECK-NEXT: call void @llvm.lifetime.start.p5(i64 16, ptr addrspace(5) [[A]]) #[[ATTR4:[0-9]+]]
+// CHECK-NEXT: call void @llvm.lifetime.start.p5(i64 16, ptr addrspace(5) [[AGG_TMP]]) #[[ATTR4]]
+void test() {
+ A a;
+ foo(a);
+}
diff --git a/clang/test/CodeGenCXX/stack-reuse-miscompile.cpp b/clang/test/CodeGenCXX/stack-reuse-miscompile.cpp
index 67fa9f9c9cd98..50c374d2710f4 100644
--- a/clang/test/CodeGenCXX/stack-reuse-miscompile.cpp
+++ b/clang/test/CodeGenCXX/stack-reuse-miscompile.cpp
@@ -26,6 +26,8 @@ const char * f(S s)
// CHECK: [[T2:%.*]] = alloca %class.T, align 4
// CHECK: [[T3:%.*]] = alloca %class.T, align 4
//
+// CHECK: [[AGG:%.*]] = alloca %class.S, align 4
+//
// FIXME: We could defer starting the lifetime of the return object of concat
// until the call.
// CHECK: call void @llvm.lifetime.start.p0(ptr [[T1]])
@@ -34,10 +36,12 @@ const char * f(S s)
// CHECK: [[T4:%.*]] = call noundef ptr @_ZN1TC1EPKc(ptr {{[^,]*}} [[T2]], ptr noundef @.str)
//
// CHECK: call void @llvm.lifetime.start.p0(ptr [[T3]])
+// CHECK: call void @llvm.lifetime.start.p0(ptr [[AGG]])
// CHECK: [[T5:%.*]] = call noundef ptr @_ZN1TC1E1S(ptr {{[^,]*}} [[T3]], [2 x i32] %{{.*}})
//
// CHECK: call void @_ZNK1T6concatERKS_(ptr dead_on_unwind writable sret(%class.T) align 4 [[T1]], ptr {{[^,]*}} [[T2]], ptr noundef nonnull align 4 dereferenceable(16) [[T3]])
// CHECK: [[T6:%.*]] = call noundef ptr @_ZNK1T3strEv(ptr {{[^,]*}} [[T1]])
+// CHECK: call void @llvm.lifetime.end.p0(ptr [[AGG]])
//
// CHECK: call void @llvm.lifetime.end.p0(
// CHECK: call void @llvm.lifetime.end.p0(
diff --git a/clang/test/CodeGenCoroutines/pr59181.cpp b/clang/test/CodeGenCoroutines/pr59181.cpp
index 21e784e0031de..a68a61984f981 100644
--- a/clang/test/CodeGenCoroutines/pr59181.cpp
+++ b/clang/test/CodeGenCoroutines/pr59181.cpp
@@ -49,6 +49,7 @@ void foo() {
}
// CHECK: cleanup.cont:{{.*}}
+// CHECK-NEXT: call void @llvm.lifetime.start.p0(ptr [[AGG:%agg.tmp]])
// CHECK-NEXT: load i8
// CHECK-NEXT: trunc
// CHECK-NEXT: store i1 false
@@ -57,3 +58,6 @@ void foo() {
// CHECK-NOT: call void @llvm.lifetime
// CHECK: call void @llvm.coro.await.suspend.void(
// CHECK-NEXT: %{{[0-9]+}} = call i8 @llvm.coro.suspend(
+
+// CHECK-LABEL: cond.end:
+// check call @llvm.lifetime.end.p0(ptr [[AGG]])
✅ With the latest revision this PR passed the C/C++ code formatter.
🐧 Linux x64 Test Results
✅ The build succeeded and all tests passed.
clang/lib/CodeGen/CGCall.cpp (outdated)
- args.add(EmitAnyExprToTemp(E), type);
+ AggValueSlot ArgSlot = AggValueSlot::ignored();
+ // If the callee returns a reference, skip this stack saving optimization;
I don't understand what "If the callee returns a reference" means, in this context.
The only way the address of the temporary created here could be user-visible is if we use it as the parameter address in the callee.
Oh, just went through D74094... I think I understand the context, but those cases seem to just be use-after-free.
In any case, this initial patch should be conservative enough that it's unlikely to cause issues.
This was brought up in https://reviews.llvm.org/D74094. In a complex expression, a reference to the temporary could be returned (e.g. passed in via a parameter and back out via return) and then used in another part of the expression.
I left that logic (and comment) in place from the original patch, since it seems to me to be conservative w.r.t. preserving the status quo.
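For readers following along, here is a minimal sketch of the "in via a parameter, back out via return" pattern. The names `Big`, `identity`, `sum`, and `round_trip` are hypothetical and do not come from the patch or from D74094. The sketch uses a const-reference parameter, where the language itself keeps the temporary alive until the end of the full expression; the by-value argument slot discussed in this PR is a related but implementation-specific case that is not reproduced here.

```cpp
#include <numeric>

// Hypothetical aggregate, large enough that it would live in a stack slot.
struct Big {
  int data[16];
};

// Hypothetical helper: receives the temporary by const reference and hands
// the same reference back to the caller.
const Big &identity(const Big &b) { return b; }

// Hypothetical consumer that runs in "another part of the expression".
int sum(const Big &b) { return std::accumulate(b.data, b.data + 16, 0); }

int round_trip() {
  // The Big temporary has to stay alive until the end of this full
  // expression, because sum() reads it through the reference that
  // identity() returned. Remaining array elements are zero-initialized.
  return sum(identity(Big{{1, 2, 3, 4}})) + 1;
}
```

If the storage for the temporary were marked dead as soon as `identity` returned, the read inside `sum` would touch dead storage, which is the kind of hazard the conservative bound is meant to avoid.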
There isn't any logic here that checks whether the callee returns a reference. And such a check wouldn't be useful, anyway; we can't catch all the ways a user could stash away a pointer.
I think the correct explanation is something like this:
// For arguments with aggregate type, create an alloca to store
// the value. If the argument's type has a destructor, that destructor
// will run at the end of the full-expression; emit matching lifetime
// markers.
//
// FIXME: For types which don't have a destructor, consider using a
// narrower lifetime bound.
And then #170518 implements the FIXME.
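As a rough illustration of the "destructor runs at the end of the full-expression" rule that the suggested comment leans on, the sketch below logs construction, use, and destruction of two temporaries. `Tracer`, `consume`, `run_demo`, and the `events` log are hypothetical names made up for this example. It binds the temporaries to const references so the behavior is fixed by the language rather than by the ABI; the by-value argument case in the patch is analogous on targets where the caller destroys arguments at the end of the full expression, but that is an assumption about the target, not something this sketch demonstrates.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log used only for this illustration.
static std::vector<std::string> events;

struct Tracer {
  std::string name;
  explicit Tracer(std::string n) : name(std::move(n)) {
    events.push_back("ctor " + name);
  }
  ~Tracer() { events.push_back("dtor " + name); }
};

// Hypothetical callee that only records that it saw the temporary.
int consume(const Tracer &t) {
  events.push_back("use " + t.name);
  return 1;
}

std::vector<std::string> run_demo() {
  events.clear();
  // Two temporaries in one full expression. The order in which the two
  // operands are evaluated is unspecified, but both destructors run only
  // when the full expression ends, before the next statement.
  int r = consume(Tracer("a")) + consume(Tracer("b"));
  events.push_back("after");
  (void)r;
  return events;
}
```

Calling `run_demo()` yields seven entries: four constructor/use entries in an unspecified interleaving, then the two destructor entries, then `after`. Lifetime markers that match this scope would end the storage at the same point the destructors run.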
I guess that is true looking more closely. I've updated the comment per your suggestion.
b4725eb to e0f9bcc
efriedma-quic left a comment
LGTM with one minor comment.
clang/lib/CodeGen/CGCall.cpp
Outdated
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Type.h"
#include "llvm/Support/TypeSize.h"
Unnecessary include.
08189ad to 564e13b
We have several issues describing suboptimal stack usage related to the lifetimes of temporary objects, such as #68747, #43598, and #109204. Previously, https://reviews.llvm.org/D74094 tried to address this. In that review, a few issues were brought up, particularly a concern about the lifetimes of the temporaries needing to be extended to the end of the full expression. While there are arguably more optimal lifetime bounds we could enforce, for now we can conservatively make them extend to the end of the full expression, and later refine the optimization to use tighter bounds (or perhaps a better mechanism in the middle end?).
Fixes #68747
Co-authored-by: Nick Desaulniers <[email protected]>
Co-authored-by: Erik Pilkington <[email protected]>
This seems to be tickling something in the dfsan tests. I’m not totally sure why. Perhaps with the lifetime in the eh path something can’t be DCEd anymore? I need to investigate if this is a real problem or a missing flag or something in the dfsan test. It’s also odd that only the single test fails.
clang/lib/CodeGen/CGCall.cpp
Outdated
// expression.
if (!CGM.getCodeGenOpts().NoLifetimeMarkersForTemporaries &&
    EmitLifetimeStart(ArgSlotAlloca.getPointer()))
  pushFullExprCleanup<CallLifetimeEnd>(NormalAndEHCleanup, ArgSlotAlloca);
Maybe using NormalEHLifetimeMarker would make a difference? We have some special case checks for cleanups that are just lifetime markers, to simplify the generated code.
Yeah, I just figured that out before reading your comment. That was exactly the issue. The other cleanup kind turned some iterator uses from calls to invokes, and that required a personality function, which would need to be a dfsan instrumented/compatible version. I didn't see this locally, since we build all the various flavors of sanitizer runtimes + libc++/libunwind combinations in our toolchain distribution.
17a82ce to 15c1888
